Building Russian Word Sketches as Models of Phrases

Author

  • Maria Khokhlova
Abstract

The paper describes the writing of a Sketch Grammar for the Russian language as part of the Sketch Engine system. Sketch Engine is a corpus tool that takes as input a corpus in any language together with the corresponding grammar patterns. The system gives information about a word’s collocability within specific dependency models and generates lists of the most frequent phrases for a given word based on the appropriate models. The paper deals with different approaches to writing rules for the grammar, based on morphological and syntactic information, and also with applying word sketches to the Russian language. The results show that word sketches and information about collocational behaviour could facilitate lexicographic work on the Russian language.

Similar resources

Applying Word Sketches to Russian

The paper describes work on writing a Russian Sketch grammar for the Sketch Engine system. The objective of such a system is to provide lexicographers with sufficient lexical material and tools for getting information about a word’s collocability, to generate lists of the most frequent phrases for a given word, and then to classify them according to the appropriate syntactic models. The system will give ...

Studying Word Sketches for Russian

Without any doubt, corpora are vital tools for linguistic studies and for solving applied tasks. Although the opportunities corpora offer are very useful, there is a need for another kind of software to further improve linguistic research, as it is impossible to process huge amounts of linguistic data manually. Sketch Engine is a corpus tool that takes as input a corpus of any ...

Legal Terms and Word Sketches: A Case Study

In this paper we describe an approach to the semiautomatic identification of legal terms in Czech texts. Our general goal is to offer supplementary tools for building a dictionary of Czech law terms. First, we used the VaDis partial parser to recognize complex nominal constructions in a legal text (the current version of the Penal Code of the Czech Republic). Headwords of the recogniz...

Goal-Source Asymmetry and Russian Spatial Prefixes

In this paper, I draw on data from Russian to argue for an asymmetry between Goal and Source prepositional phrases. Source prepositional phrases are structurally ambiguous; they can occur both as arguments and adjuncts in certain syntactic contexts. Goal prepositional phrases are unambiguously arguments. I claim that Source prepositions have lexically specified semantics, which determines their...

Russian Named Entities Recognition and Classification Using Distributed Word and Phrase Representations

The paper presents results on Russian named entities classification and equivalent named entities retrieval using word and phrase representations. It is shown that a word or an expression’s context vector is an efficient feature to be used for predicting the type of a named entity. Distributed word representations are now claimed (and on a reasonable basis) to be one of the most promising distr...

Publication date: 2010